Content Based Audio Retrieval Based on Hidden Markov Models: Speech and Audio Processing and Recognition Final Project

Authors

  • DAN ELLIS
  • MANUEL REYES
Abstract

This project implements a system that, given an audio file as input, retrieves the five most similar audio files from an audio database. I concentrated on indoor and outdoor environmental audio files. Audio is an important medium that includes speech, music, and various kinds of environmental noise. With recent public access to various audio databases, managing such databases has become an interesting research topic. Far less work has been done on content-based audio retrieval systems than on content-based image and video retrieval systems. An ideal content-based audio retrieval system would handle all kinds of audio files: speech, music, and environmental sounds. Given the time available for this project, I restricted the scope to environmental audio.
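As a rough illustration of the retrieval step described in the abstract, the sketch below ranks database files by how well a per-file HMM explains the query and returns the best five. The abstract does not say how the HMMs are trained or applied, so everything here is an assumption: one discrete HMM per database file, a query already converted to a sequence of integer symbols (for example by vector quantization of feature frames), and the hypothetical names `log_likelihood` and `retrieve`.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical model container: (initial probs, transition matrix, emission matrix).
HMM = Tuple[List[float], List[List[float]], List[List[float]]]


def log_likelihood(hmm: HMM, obs: Sequence[int]) -> float:
    """Scaled forward algorithm: log P(obs | hmm) for a discrete HMM."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    pi, A, B = hmm
    n = len(pi)
    # Initialisation: alpha_0(i) = pi_i * b_i(o_0)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(o_t)
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_p += math.log(scale)  # accumulate the log of each scaling factor
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def retrieve(query: Sequence[int], models: Dict[str, HMM], k: int = 5) -> List[str]:
    """Return the names of the k models that score the query highest."""
    scores = {name: log_likelihood(hmm, query) for name, hmm in models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Ranking by log-likelihood is only one possible similarity measure. The actual system may compare files differently.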

Similar resources

Precision

As one of the key methods for extracting content semantics and structure from audio, automatic audio classification, especially of speech and music, is valuable for content-based audio retrieval, video summarization and retrieval, spoken document retrieval, and similar tasks. Because a hidden Markov model (HMM) can model the temporal statistical properties of an audio signal well, a left-right discrete HMM is proposed to c...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers because of its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...

Characteristics of the Use of Coupled Hidden Markov Models for Audio-Visual Polish Speech Recognition

This paper focuses on combining audio and visual signals for Polish speech recognition when the audio speech signal is highly disturbed. Recognition of audio-visual speech was based on coupled hidden Markov models (CHMM). The described methods were developed for a single isolated command; nevertheless, their effectiveness indicated that they would also work similarly in continuous audio-visual sp...

Lip-reading from parametric lip contours for audio-visual speech recognition

This paper describes the incorporation of a visual lip-tracking and lip-reading algorithm that uses affine-invariant Fourier descriptors from parametric lip contours to improve audio-visual speech recognition systems. The audio-visual speech recognition system presented here uses parallel hidden Markov models (HMMs), where a joint decision, using an optimal decision rule, is made af...

An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition

This paper presents a novel Hidden Markov Model architecture to model the joint probability of pairs of asynchronous sequences describing the same event. It is based on two other Markovian models, namely Asynchronous Input/Output Hidden Markov Models and Pair Hidden Markov Models. An EM algorithm to train the model is presented, as well as a Viterbi decoder that can be used to obtain the optima...

Publication date: 2001